Week 3: Approximate Near Neighbor Search

Author

Aleksandar Nikolov

Abstract

Last week we discussed the randomized O(n) expected time algorithm to compute the closest pair of points in the plane. This week we continue with a related data structure problem: (approximate) near neighbor search. Suppose you have a database D of n entries: images, text documents, census data, etc. Using a Dictionary data structure, for example a B-tree or a hash table, you can check if an entry x is in D. But sometimes you actually want to find an entry y in the database which is as “close” as possible to x, and not necessarily exactly the same. This is one of the most basic ways to do classification, for example: you keep a database of labeled images, and given a new image you want to see if it is close to any image in the database; if your new image is close to an image of a dog, then maybe it is also an image of a dog. Or, more simply, databases can contain errors, and looking for an approximate match is a way to make our database search more robust.
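As a baseline for the problem described above, a near neighbor query can always be answered by a linear scan over the database. The following is a minimal Python sketch, not taken from the lecture notes: the function names, the toy two-dimensional labeled database, and the choice of Euclidean distance are all illustrative assumptions.

```python
import math

def nearest_neighbor(database, query):
    """Return the (point, label) entry of `database` closest to `query`.

    Linear scan: one distance computation per entry.
    """
    return min(database, key=lambda entry: math.dist(entry[0], query))

def near_neighbor(database, query, r):
    """Return some entry within distance r of `query`, or None if there is none."""
    for point, label in database:
        if math.dist(point, query) <= r:
            return point, label
    return None

# Hypothetical labeled database: each entry is (feature vector, label).
database = [((0.0, 0.0), "cat"), ((5.0, 5.0), "dog"), ((9.0, 1.0), "bird")]

# Classify a new point by the label of its closest database entry.
point, label = nearest_neighbor(database, (4.0, 6.0))
```

Each query here costs time proportional to the database size; data structures for approximate near neighbor search are typically aimed at answering queries faster than this scan, at the price of returning a point that is only approximately closest.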

Related resources

Graph-based time-space trade-offs for approximate near neighbors

We take a first step towards a rigorous asymptotic analysis of graph-based approaches for finding (approximate) nearest neighbors in high-dimensional spaces, by analyzing the complexity of (randomized) greedy walks on the approximate near neighbor graph. For random data sets of size n = 2^{o(d)} on the d-dimensional Euclidean unit sphere, using near neighbor graphs we can provably solve the approx...

Coding for Random Projections and Approximate Near Neighbor Search

Abstract This technical note compares two coding (quantization) schemes for random projections in the context of sub-linear time approximate near neighbor search. The first scheme is based on uniform quantization [4] while the second scheme utilizes a uniform quantization plus a uniformly random offset [1] (which has been popular in practice). The prior work [4] compared the two schemes in the ...

A Replacement for Voronoi Diagrams of Near Linear Size

A compressed quadtree-based replacement for approximate Voronoi diagrams with near-linear complexity, using hierarchical clustering and prioritized point location among balls, with applications to improved approximate nearest neighbour search using point location among equal balls, to fat triangulations of proximity diagrams in two and higher dimensions, and to fast approximate proximity search.

Learning Vocabulary-Based Hashing with AdaBoost

Approximate near neighbor search plays a critical role in various kinds of multimedia applications. The vocabulary-based hashing scheme uses vocabularies, i.e. selected sets of feature points, to define a hash function family. The function family can be employed to build an approximate near neighbor search index. The critical problem in vocabulary-based hashing is the criteria of choosing vocab...

SIMP: Accurate and Efficient Near Neighbor Search in Very High Dimensional Spaces

Near neighbor search in very high dimensional spaces is useful in many applications. Existing techniques solve this problem efficiently only for the approximate case. These solutions are designed to solve r-near neighbor queries only for a fixed query range or a set of query ranges with probabilistic guarantees, and are then extended to nearest neighbor queries. Solutions supporting a set of query...

Publication date: 2018
